To address the problems that most feature selection algorithms do not fully consider the non-uniform class distribution of data, the correlation between features, and the influence of different parameters on feature selection results, a feature selection method for imbalanced data based on neighborhood tolerance mutual information and the Whale Optimization Algorithm (WOA) was proposed. Firstly, for binary and multi-class datasets in an incomplete neighborhood decision system, two kinds of feature importance for imbalanced data were defined on the basis of the upper and lower boundary regions. Then, to fully reflect the decision-making ability of features and the correlation between features, neighborhood tolerance mutual information was developed. Finally, by integrating the feature importance of imbalanced data with the neighborhood tolerance mutual information, a Feature Selection for Imbalanced Data based on Neighborhood tolerance mutual information (FSIDN) algorithm was designed, in which the optimal parameters of the feature selection algorithm were obtained by WOA, and a nonlinear convergence factor and an adaptive inertia weight were introduced to improve WOA and keep it from falling into local optima. Experiments on 8 benchmark functions show that the improved WOA has good optimization performance, and feature selection experiments on 13 binary and 4 multi-class imbalanced datasets show that, compared with other related algorithms, the proposed algorithm can effectively select feature subsets with good classification performance.
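For orientation, the baseline WOA that the abstract improves upon can be sketched as follows. This is a simplified illustration of the standard algorithm with a linear convergence factor, not the improved variant: the abstract does not give the form of the nonlinear convergence factor or the adaptive inertia weight, so neither is reproduced here. The names `woa_minimize` and `sphere`, the per-agent scalar coefficients, the immediate update of the best agent and the bound clipping are all assumptions made for the sketch.

```python
import math
import random


def sphere(x):
    """Simple benchmark function used only for illustration."""
    return sum(v * v for v in x)


def woa_minimize(f, dim, lo, hi, n_agents=20, n_iter=100, seed=0):
    """Simplified baseline WOA (hypothetical helper, not the paper's variant)."""
    rng = random.Random(seed)
    agents = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_agents)]
    best = min(agents, key=f)[:]
    best_val = f(best)
    for t in range(n_iter):
        # Linear convergence factor of the baseline: decreases from 2 toward 0.
        a = 2.0 * (1.0 - t / n_iter)
        for i, x in enumerate(agents):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # Encircle the best agent when |A| < 1, else explore around a random agent.
                ref = best if abs(A) < 1.0 else rng.choice(agents)
                new = [r - A * abs(C * r - v) for r, v in zip(ref, x)]
            else:
                # Spiral update toward the best agent (spiral shape constant b = 1).
                l = rng.uniform(-1.0, 1.0)
                new = [abs(r - v) * math.exp(l) * math.cos(2 * math.pi * l) + r
                       for r, v in zip(best, x)]
            # Keep the agent inside the search bounds (an assumption of this sketch).
            agents[i] = [min(hi, max(lo, v)) for v in new]
            val = f(agents[i])
            if val < best_val:
                best, best_val = agents[i][:], val
    return best, best_val
```

A call such as `woa_minimize(sphere, dim=3, lo=-5.0, hi=5.0)` returns the best position found and its function value. An improved variant would replace the line computing `a` and add a weight to the position updates.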
In the Synthetic Minority Over-sampling TEchnique (SMOTE), noise samples may participate in the synthesis of new samples, so the rationality of the new samples is difficult to guarantee. To address this problem, an improved algorithm combining clustering, called Clustered Synthetic Minority Over-sampling TEchnique (CSMOTE), was proposed. The algorithm abandons linear interpolation between nearest neighbors and instead synthesizes new samples by linear interpolation between the cluster centers of minority classes and the samples of the corresponding clusters. In addition, the samples involved in the synthesis were screened to reduce the possibility of noise samples participating in the synthesis. On six real-world datasets, the CSMOTE algorithm was compared repeatedly with four improved SMOTE algorithms and two under-sampling algorithms, and it obtained the highest AUC values on all datasets. Experimental results show that the CSMOTE algorithm achieves higher classification performance and can effectively solve the problem of unbalanced sample distribution in datasets.
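The interpolation step described above can be illustrated with a minimal sketch. It assumes plain k-means as the clustering algorithm and a uniformly random interpolation coefficient, neither of which is specified in the abstract. The screening of samples is omitted because its criterion is not given. The names `kmeans` and `csmote_sketch` are hypothetical.

```python
import random


def _nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centers[c])))


def kmeans(points, k, iters=20, seed=0):
    """Plain k-means; assumed here as the clustering step."""
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[_nearest(p, centers)].append(p)
        for j, members in enumerate(clusters):
            if members:  # keep the old center if a cluster becomes empty
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    # Final assignment so that clusters match the returned centers.
    clusters = [[] for _ in range(k)]
    for p in points:
        clusters[_nearest(p, centers)].append(p)
    return centers, clusters


def csmote_sketch(minority, k, n_new, seed=0):
    """Synthesize n_new samples between cluster centers and their own members."""
    rng = random.Random(seed)
    centers, clusters = kmeans(minority, k, seed=seed)
    pairs = [(centers[j], p) for j, members in enumerate(clusters) for p in members]
    synthetic = []
    for _ in range(n_new):
        center, sample = rng.choice(pairs)
        gap = rng.random()  # interpolation coefficient in [0, 1)
        synthetic.append([c + gap * (s - c) for c, s in zip(center, sample)])
    return synthetic
```

Because every synthetic point lies on a segment between a cluster center and a member of that cluster, it stays inside the region already occupied by the cluster, which is the property that distinguishes this scheme from nearest-neighbor interpolation.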
An improved Discrete Particle Swarm Optimization (DPSO) algorithm was proposed for solving the Flexible Flow Shop scheduling Problem (FFSP) with the makespan criterion. The proposed algorithm redefined the operators for particle velocity and position, and introduced an encoding matrix and a decoding matrix to represent the relationship among jobs, machines and the schedule. To improve the quality of the initial population of the improved DPSO algorithm for FFSP, a shortest time decomposition strategy based on the NEH (Nawaz-Enscore-Ham) algorithm was proposed by analyzing the relationship between the initial machine selection and the total completion time. The experimental results show that the proposed algorithm performs well in solving FFSP and is an effective scheduling algorithm.
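As background for the initialization strategy, the classical NEH heuristic can be sketched as follows. The sketch covers only the standard permutation flow shop with one machine per stage. It is not the shortest time decomposition strategy of the abstract, whose details are not given, and it does not model the parallel machines of FFSP. The function names and the `proc[j][m]` data layout are assumptions.

```python
def makespan(sequence, proc):
    """Completion time of the last job on the last machine.

    proc[j][m] is the processing time of job j on machine m.
    """
    n_machines = len(proc[0])
    finish = [0] * n_machines  # finish time of the latest job on each machine
    for j in sequence:
        for m in range(n_machines):
            # A job starts on machine m once the machine is free and the job
            # has left machine m - 1.
            ready = finish[m - 1] if m > 0 else 0
            finish[m] = max(finish[m], ready) + proc[j][m]
    return finish[-1]


def neh(proc):
    """Classical NEH: insert jobs, longest total time first, at the best position."""
    order = sorted(range(len(proc)), key=lambda j: -sum(proc[j]))
    seq = []
    for j in order:
        candidates = [seq[:i] + [j] + seq[i:] for i in range(len(seq) + 1)]
        seq = min(candidates, key=lambda s: makespan(s, proc))
    return seq
```

A sequence built this way could seed part of an initial swarm, with the remaining particles generated at random to preserve diversity. That is a common design choice rather than something stated in the abstract.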
To solve the sensor node localization problem in Wireless Sensor and Actor Networks (WSAN), a range-based localization algorithm with virtual force was proposed, in which mobile actor nodes were used instead of Wireless Sensor Network (WSN) anchors, and Time Of Arrival (TOA) was combined with virtual force. In this algorithm, the actor nodes were driven by virtual force to move toward the sensor node that sent the location request, and node localization was completed by calculating the distance between nodes from the signal transmission time. The simulation results show that the proposed algorithm can improve the localization success rate by 20%, and that its average localization time and cost are lower than those of the traditional TOA algorithm. It is suitable for real-time applications with a small number of actor nodes.
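The TOA ranging step can be illustrated with a short sketch: distance is the propagation speed multiplied by the transmission time, and a 2D position follows from three such distances. The sketch assumes a radio signal, synchronized clocks, noise-free measurements and three non-collinear reference positions. The trilateration step and the function names are assumptions, and the virtual force that moves the actor nodes is not modeled.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s; assumes a radio signal


def toa_distance(t_send, t_recv, speed=SPEED_OF_LIGHT):
    """Distance from the one-way signal transmission time."""
    return speed * (t_recv - t_send)


def trilaterate(anchors, dists):
    """Solve for (x, y) from three reference positions and measured distances.

    Subtracting the first circle equation from the other two gives a
    2x2 linear system, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = dists
    a11, a12 = 2 * (x2 - x1), 2 * (y2 - y1)
    a21, a22 = 2 * (x3 - x1), 2 * (y3 - y1)
    b1 = d1 ** 2 - d2 ** 2 + x2 ** 2 - x1 ** 2 + y2 ** 2 - y1 ** 2
    b2 = d1 ** 2 - d3 ** 2 + x3 ** 2 - x1 ** 2 + y3 ** 2 - y1 ** 2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("reference positions are collinear")
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y
```

With measurement noise, a least-squares fit over more than three distances would normally replace the exact solve.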